ABSTRACT Allelic association provides a means to map
disease genes that, in a dense map of polymorphic markers,
has considerably higher resolution than linkage methods. We
describe here a composite likelihood estimate of location for
a disease gene against a high-resolution marker map by using
allele frequencies at linked loci. Data may be family-based, as
in the transmission disequilibrium test, or from a case-control
study. χ² tests, logarithm of odds, standard errors, and
information weights are provided. The method is illustrated
by analysis of published cystic fibrosis haplotypes, in which
ΔF508 is more accurately localized than by other association
studies. This differs from current approaches by adopting a
more general Malecot model for isolation by distance, where
distance here is between marker and disease locus, allowance
for errors in the map and model, and freedom from assumptions
about demography, systematic pressures, and the ratio
of physical to genetic distance. When these assumptions are
introduced the number of generations since the original
mutation may be estimated, but this is not required to
determine location and its standard error, so that evidence
from allelic association may be efficiently combined with
linkage evidence to identify a region for positional cloning of
a disease gene.



Dependence of allele frequencies at two loci is called allelic
association, linkage disequilibrium, or gametic disequilibrium.
We shall use the first term. Spurious allelic association is not
characteristic of the population, but is either a type I error or
is induced by biased sampling or typing. Real allelic association
can be confirmed in multiple samples. Allelic association
mapping depends on the association of specific marker alleles
with a disease mutation and the expectation of greater association
as the disease locus is approached. The strength of the
association depends on pressure to disrupt haplotypes of
linked loci by recombination and mutation and the effects of
selection and drift. Data may be family-based or a case-control
study of individuals without close relationship. Linkage mapping
requires cosegregation of marker and disease alleles
within a family and can involve any allele at the marker locus.
Allelic association provides a means to map genes for disease
susceptibility that is independent of linkage evidence and, in
favorable cases, has greater resolution. To exploit this we
require an integrated map that combines genetic and physical
evidence, an estimate of location on the same scale for linkage
and association, and efficient weights by which they may be
combined to give a single, optimal estimate and test of
significance that in principle are the same as for two linkage
samples. Here we show how such an analysis may be performed
by the ALLASS program for testing, estimating, and mapping
allelic association. ALLASS is written in C and is available from
http://cedar.genetics.soton.ac.uk/public_html/.
Association ρ

Assuming that recombination dominates systematic pressure
of mutation, selection, and long-range migration (with which
it is confounded), the natural measure of allelic association is

$$\rho_{ij} = (1 - \theta_{ij})^{t} \approx \exp(-t\,\theta_{ij})$$    [1]

where θ_ij is the recombination rate per gamete per generation
between loci I and J, t is the number of generations during
which the population has been approaching equilibrium, and
ρ_ij is the (coefficient of) association between I and J (1).
Neglecting stochastic variation because of finite population
size, the expected frequency of haplotypes with allele u at locus
I and allele v at locus J is

$$f_{uv} = (1 - \rho_{ij})\, q_u q_v + \rho_{ij}\, Q_{uv}$$    [2]

where q_u, q_v are the marginal gene frequencies (assumed
constant in time) and Q_uv was the corresponding haplotype
frequency among founders t generations ago (u = 1,.., U and
v = 1,.., V).
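As a numerical illustration of Eqs. 1 and 2, the short Python sketch below computes the association remaining after t generations and the resulting expected haplotype frequency. The function names and the example values (θ = 0.001, t = 200, and the allele and founder frequencies) are illustrative assumptions, not quantities taken from the text.

```python
import math

def association(theta, t):
    """Eq. 1: exact decay (1 - theta)^t and its exponential approximation."""
    return (1.0 - theta) ** t, math.exp(-t * theta)

def expected_haplotype_freq(rho, q_u, q_v, Q_uv):
    """Eq. 2: mixture of the equilibrium product q_u*q_v and the founder frequency Q_uv."""
    return (1.0 - rho) * q_u * q_v + rho * Q_uv

# Hypothetical inputs: 0.1% recombination per generation, 200 generations.
exact, approx = association(theta=0.001, t=200)
f_uv = expected_haplotype_freq(approx, q_u=0.02, q_v=0.3, Q_uv=0.02)
print(exact, approx, f_uv)
```

For small θ the two forms of Eq. 1 are nearly indistinguishable, which is why the exponential form is convenient for modelling decay with distance.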

Attempts to apply this theory encounter the problems that
the founder haplotype frequencies Q_uv are unknown and the
model is greatly simplified. Therefore, ρ has been neglected in
favor of kinship φ, a metric based on χ² with (U − 1)(V − 1)
degrees of freedom that does not require estimation of Q_uv (2).
Usually the power of parsimonious models, even if approximate,
is greater than for models with many degrees of freedom
(3). Illustrations of this principle in genetics include tests of
Hardy-Weinberg equilibrium (4) and of oligogenic linkage (5).
We therefore conjectured that maximum likelihood estimation
of a single value of ρ for each marker locus, where applicable,
would provide the most reliable inference about allelic association
and therefore about the location of disease genes.
Models with multiple values of ρ, measuring the association
between each marker allele and the disease locus, imply an
equal number of unknown values of Q_uv, and no general theory
has been developed that is biologically meaningful and efficiently
estimable (6). However, the special case of a 2 × 2
haplotype table has proven manageable and useful. The cells
of each table give counts for a given marker allele where a
represents disease haplotypes with the allele, b gives disease
haplotypes without the allele, and c and d represent the
corresponding normal haplotypes. Given a two-allele disease
locus, the only practical problem is to reduce a U-allele marker
locus to two alleles.

Merging Associated Marker Alleles 

We reduce U 2 × 2 tables for disease allele × marker allele by
merging associated alleles through a stepwise process. The
allele with the largest value of χ² is taken to be associated,



Abbreviations: CFTR, cystic fibrosis transmembrane conductance
regulator; lod, logarithm of odds.

*To whom reprint requests should be addressed. e-mail: arc@
soton.ac.uk.






Medical Sciences: Collins and Morton 



Proc. Natl. Acad. Sci. USA 95 (1998)



whether χ² is significant or not. Thus, at each marker there is
at least one allele in the "associated" class (and this association
may be positive or negative). For each table, if the determinant
ad − bc is negative, this corresponds to a (possibly spurious)
protective marker allele, and only similar protective alleles will
be pooled into the associated class. A positive value of ad − bc
corresponds to a (possibly spurious) susceptibility allele to be
pooled with similar susceptibility alleles. Then for U > 2, the
selected allele is excluded and the test is repeated on the
remainder, only significant association being accepted. We
define significance as χ²₁ with Yates' correction > 5 without a
Bonferroni correction for the number of tests. In our experience
this gives an acceptable balance between type I and type
II errors. This process is continued until only one allele
remains or no remaining allele is significant. Table 1 defines
the final haplotype counts as 2 × 2 tables for each marker.
Formally this procedure is the same as has been used to
designate founders in a phylogeny (7), but here the associated
alleles are pooled into one class and the remaining alleles are
pooled into the second class. When a marker has both positively
and negatively associated alleles, it is treated as two loci
with the same location, one with positively associated alleles
versus the rest, the other with negatively associated alleles
versus the rest. In the latter case, a and b are interchanged, as
are c and d, so that ρ > 0. As with any assumption, the equality
of ρ for different alleles and for positive and negative associations
may be questioned. "Protective" marker alleles reflect
haplotypes in which few disease mutations have occurred, but
recombination is the same as for positively associated alleles at
the same locus. For a diallelic locus the absolute values are
equal. Although no model can include all possible deviations,
the analysis makes allowance for errors by separating the
estimation of ρ for each marker locus (Table 1) from its
expected value. The disease frequency determines an enrichment
factor ω as the ratio of the number of cases to controls
divided by the ratio of disease frequency to normal in the
population of haplotypes. Introduction of ω makes it unnecessary
to approximate the associated marker allele frequency
R in the population by its frequency among controls (6). This
approximation is poor unless Q ≪ R.
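The stepwise merging described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ALLASS implementation: the function names and the example counts are hypothetical, the Yates-corrected χ² is used at every step (the text does not say whether the first selection uses the corrected or uncorrected statistic), a zero determinant is arbitrarily classed as negative, and the case of a marker with both positively and negatively associated alleles (treated in the text as two loci) is not handled.

```python
def yates_chi2(a, b, c, d):
    """Yates-corrected chi-square (1 df) for a 2 x 2 table of haplotype counts."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    dev = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * dev * dev / denom

def merge_associated(disease, normal, threshold=5.0):
    """Pool associated marker alleles stepwise.

    disease, normal: dicts mapping allele label -> haplotype count.
    Returns (list of pooled alleles, sign of ad - bc for the pooled class).
    """
    remaining = set(disease)
    associated, sign = [], 0
    while len(remaining) > 1:
        tot_d = sum(disease[u] for u in remaining)
        tot_n = sum(normal[u] for u in remaining)
        best = None
        for u in sorted(remaining):
            a, c = disease[u], normal[u]
            b, d = tot_d - a, tot_n - c          # the rest of the remaining alleles
            s = 1 if a * d - b * c > 0 else -1   # susceptibility (+) or protective (-)
            if sign and s != sign:
                continue                          # pool only alleles of the same sign
            x2 = yates_chi2(a, b, c, d)
            if best is None or x2 > best[0]:
                best = (x2, u, s)
        if best is None:
            break
        x2, u, s = best
        # The first allele is accepted regardless; later ones must exceed the threshold.
        if associated and x2 <= threshold:
            break
        associated.append(u)
        sign = s
        remaining.discard(u)
    return associated, sign

# Hypothetical counts for a four-allele marker.
disease = {"A": 40, "B": 30, "C": 5, "D": 5}
normal = {"A": 10, "B": 30, "C": 30, "D": 30}
pooled, sign = merge_associated(disease, normal)
print(pooled, sign)
```

With these made-up counts, allele A is selected first and allele B is then pooled with it because it remains significantly enriched among disease haplotypes once A is excluded.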

In passing we make obvious extensions. If a quantitative trait
is substituted for a disease dichotomy, the regression of the
trait on the number 0, 1, or 2 of marker alleles is proportional
to ρ. In the transmission disequilibrium test at least one parent
is heterozygous for a marker allele associated with the disease.
Therefore, the marker allele has frequency R = 0.5. The test
uses only affected offspring, controls are omitted, and the
transmission frequency from a marker heterozygote to affected
children is (1 + ρ)/2 (8).
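The relation between transmission frequency and ρ can be inverted directly. The sketch below is an illustration with hypothetical function names and made-up counts; the χ² shown is the standard McNemar-type statistic of the transmission disequilibrium test (8), included here only for orientation.

```python
def tdt_rho(transmitted, not_transmitted):
    """Estimate rho from transmissions of the associated allele by heterozygous parents.

    Transmission frequency tau = (1 + rho)/2, so rho = 2*tau - 1.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return 2.0 * transmitted / total - 1.0

def tdt_chi2(transmitted, not_transmitted):
    """McNemar-type chi-square (1 df) for the transmission disequilibrium test."""
    total = transmitted + not_transmitted
    return (transmitted - not_transmitted) ** 2 / total

# Hypothetical data: 75 transmissions and 25 non-transmissions to affected children.
print(tdt_rho(75, 25), tdt_chi2(75, 25))
```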

Table 1. Haplotype frequencies by population



Location S_D

Because alleles have been dichotomized by disease association,
we may simplify the notation by letting ρ̂_i be the maximum
likelihood estimate of association between disease and the ith
marker locus with information K_i given in the Appendix.
Assuming that allelic association is declining from a higher
level in founders, association plausibly follows the Malecot
model for isolation by distance (1),

$$\rho_i = (1 - L)\,M \exp(-\varepsilon d_i) + L$$    [3]

The Malecot model was derived to describe kinship as a
function of distance between populations. We adapt it here to
represent distance between marker and disease locus. The
general characteristics of the Malecot model are illustrated in
Fig. 1. The parameter M reflects a monophyletic or polyphyletic
origin of susceptible haplotypes and is 1 if there is a unique
susceptible haplotype and marker mutation is negligible, and
less than 1 otherwise; ε > 0 is dependent on the number of
generations during which the haplotypes have been approaching
equilibrium and the pressure to disrupt them by recombination,
mutation, and perhaps selection; L is the bias due to
spurious association in the sample resulting from the constraint
ρ_i > 0, and d_i ≥ 0 is the distance between disease locus
and the ith marker locus (9). Departures from the model
including mutational heterogeneity, errors in the map, disproportion
between physical distance and recombination, failure
to report nonsignificant values of ρ, and neglect of associated
alleles other than the most significant can distort estimates of
M and L.
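The roles of M, ε, and L in Eq. 3 are easy to see numerically. The sketch below evaluates the Malecot curve at a few distances for several values of ε, using M = 0.75 and L = 0.1 as in Fig. 1; the function name and the chosen ε values and distances are illustrative assumptions.

```python
import math

def malecot(d, eps, M=1.0, L=0.0):
    """Eq. 3: expected association at distance d (Mb) from the disease locus."""
    return (1.0 - L) * M * math.exp(-eps * d) + L

# Association starts at (1 - L)*M + L when d = 0 and declines toward L.
for eps in (0.5, 1.0, 2.0, 4.0):        # hypothetical values of epsilon
    curve = [round(malecot(d, eps, M=0.75, L=0.1), 3) for d in (0.0, 0.5, 1.0, 2.0)]
    print(eps, curve)
```

Larger ε gives a steeper decline, M sets the intercept, and L sets the floor that the curve approaches at large distances.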

To apply the Malecot model we suppose that a small region
contains m ordered markers G_1,.., G_m and perhaps a disease
locus D. The physical locations S_1,.., S_m of markers are assumed
to be known without error. It is convenient to take the distance
from marker i to the disease locus as d_i = δ_i (S_i − S_D), where

$$\delta_i = \begin{cases} 1 & \text{if } S_i > S_D \\ -1 & \text{else} \end{cases}$$    [4]

so that the derivative of the composite likelihood takes the
appropriate sign. We assume that the S_i are measured in Mb
from G_1 (so that S_1 = 0). The logarithmic likelihood of the
multiple pairwise observations summed over marker loci is

$$\ln lk = -\sum_{i=1}^{m} K_i\,(\hat\rho_i - \rho_i)^2 / 2$$    [5]
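Eqs. 3 to 5 can be combined into a small sketch of the composite likelihood. The code below is an illustration only: it fixes ε, M, and L and scans S_D over a grid, whereas the text describes joint estimation with a variable metric algorithm. The function names, marker positions, information weights, and the simulated "true" location are all hypothetical.

```python
import math

def malecot(d, eps, M=1.0, L=0.0):
    """Eq. 3: expected association at distance d from the disease locus."""
    return (1.0 - L) * M * math.exp(-eps * d) + L

def composite_lnlk(S, rho_hat, K, S_D, eps, M=1.0, L=0.0):
    """Eq. 5: sum over markers of -K_i (rho_hat_i - rho_i)^2 / 2."""
    total = 0.0
    for s, r, k in zip(S, rho_hat, K):
        d = abs(s - S_D)                 # Eq. 4: d_i = delta_i (S_i - S_D) >= 0
        total -= k * (r - malecot(d, eps, M, L)) ** 2 / 2.0
    return total

def grid_search_location(S, rho_hat, K, eps, M=1.0, L=0.0, step=0.001):
    """Scan S_D between the first and last marker; return (S_D, chi-square = -2 ln lk)."""
    lo, hi = min(S), max(S)
    n = int(round((hi - lo) / step))
    best = max(
        (composite_lnlk(S, rho_hat, K, lo + i * step, eps, M, L), lo + i * step)
        for i in range(n + 1)
    )
    return best[1], -2.0 * best[0]

# Hypothetical map (Mb) with noise-free associations generated at S_D = 0.9.
S = [0.0, 0.5, 0.8, 1.0, 1.5]
K = [200.0] * len(S)
rho_hat = [malecot(abs(s - 0.9), eps=1.0) for s in S]
S_D_hat, chi2 = grid_search_location(S, rho_hat, K, eps=1.0)
print(S_D_hat, chi2)
```

Because the simulated associations follow the model exactly, the grid search recovers the generating location and the residual χ² is essentially zero; with real data the residual χ² provides the goodness-of-fit test.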



Goodness of fit is tested by χ² = −2 ln lk with m − n degrees of
freedom, where m is the number of marker loci and n is the
number of parameters estimated. The logarithm of odds (lod)
for allelic association is derived from the difference between
total χ² (Table 2) and χ²_{m−n} for the accepted model, which is
itself a χ² with n degrees of freedom (see Appendix). At this
point objection could be raised that the terms in a composite
likelihood (Eq. 5) are not independent but positively correlated,
a fact neglected in other multiple pairwise analyses of
allelic association. This tends to make the χ² test conservative,
given exact weights and an exact model. A nominally significant
χ² must be accommodated in the analysis, conventionally
by the empirical information we propose.

Maximum likelihood estimates of S_D and the significant
nuisance parameters M, L, and ε give conditional information
about location as K_D = 1/K⁻¹_SS, where K⁻¹_SS is the corresponding
element in the covariance matrix. To obtain a
maximum likelihood estimate of S_D, efficient combination
with linkage as Σ K_D S_D / Σ K_D is straightforward regardless of
which is more informative (5). If residual χ² is significant, the
corresponding K_D should be divided by χ²/df. This allowance
for errors in the model is essential if evidence on linkage and
allelic association is to be pooled and a minimal region is to be
defined for positional cloning.

Notes to Table 1: For the ith marker locus all parameters are
subscripted by i. ln lk = a ln f_11 + b ln f_12 + c ln f_21 + d ln f_22;
Q, disease gene frequency; R, disease-associated marker allele
frequency; ω, sample enrichment factor; ρ, association.

[Figure: Malecot model, M = 0.75, L = 0.1; x-axis: distance from disease locus (Mb)]

Fig. 1. Association is described as a function of distance from
disease to marker locus in megabases and parameters, with ε reflecting
the number of generations since the original mutation, M reflecting
mono- or polyphyletic origin of the mutation, and L representing bias
introduced by assuming at least one associated allele per marker.
Curves illustrate the decline of association with distance for a range of
values of ε, assuming M = 0.75 and L = 0.1.

ΔF508: A Monophyletic Allele

Polyphyletic minor genes with a long history are a difficult and
perhaps insuperable problem for disease mapping by allelic
association unless the markers are within a candidate locus.
We therefore look first at monophyletic major genes, which
have a short history. The cystic fibrosis transmembrane conductance
regulator (CFTR) locus that determines cystic fibrosis
is "the best example of the utility of linkage disequilibrium
in mapping disease genes" (10). The locus spans 250 kb
between the restriction fragment length polymorphisms
(RFLPs) D7S23 and D7S8 (11). On the map of Kerem et al.
(12) CFTR occupies the interval from 0.78 Mb to 1.03 Mb
distal to MET, with ΔF508 at position 0.88 (Table 2). Kerem
et al. reported 23 RFLPs defining 77 haplotypes with ΔF508
and 149 other haplotypes. To secure monophyletic origin we
merged non-ΔF508 alleles with the control sample. Tsui (13)
estimated the European gene frequency of ΔF508 as .014.
These observations imply ω = (.986)(77)/(.014)(149) = 36.4.
Other data have been reported on this interval. The ALLASS
program gives for each dataset and its specific ω an intermediate
output with S, ρ, K, and χ² for each marker. These files
may be pooled, with partition of homogeneity χ² by dataset if
overall heterogeneity is significant. To illustrate this approach
we included the three intragenic microsatellites of Morral et al.
(14), of which IVS8CA has negatively associated alleles (14, 15,
16, 18) in addition to the positively associated ones (17, 23). By
our convention this generates two markers at the same location.
Estimates of association are consistent with surrounding
RFLPs (Table 2).

Table 2. The CFTR region (12, 14)
2 table for each marker.
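The enrichment factor quoted for the Kerem sample follows directly from its definition (the ratio of cases to controls divided by the ratio of disease to normal haplotype frequency in the population). A minimal sketch, with a hypothetical function name:

```python
def enrichment_factor(cases, controls, Q):
    """omega = (cases/controls) / (Q/(1 - Q)), with Q the disease gene frequency."""
    return (cases / controls) / (Q / (1.0 - Q))

# Values from the text: 77 haplotypes with the mutation, 149 others, Q = .014.
omega = enrichment_factor(cases=77, controls=149, Q=0.014)
print(round(omega, 1))
```

A value of ω far above 1 reflects how strongly a case-control sample over-represents disease haplotypes relative to the population.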

Association declines more rapidly distal to CFTR, with a
650-kb gap before the three most distal markers. For all 27
markers the best fit is at M = 1, L = 0, but a slightly smaller
value of M and larger value of L are not excluded (Table 3).
ΔF508 is positioned below its accepted location at 0.834 Mb
(Table 4), but the difference when ΔF508 is positioned at the
actual location (0.88) gives a χ²₁ = 5.38, which is not significant
at the .001 level used by Terwilliger (6) or the .01 level of
Devlin et al. (15). Significance tests in multiple pairwise
mapping are approximate. To explore this further we made
two other analyses. When the three most proximal and most
distal markers are omitted, χ²₁ is reduced to 3.77. When the
number of markers is reduced to 13 by adopting the 9 regions
of Kerem et al. (12), χ²₁ is 2.75. The effect on the estimate of
location is very small and χ² values for the various hypotheses
and datasets correspond quite well with degrees of freedom. In
no analysis is the estimate of M less than 1 nor the estimate of
L significantly different from 0. We expect M to be 1 for a
monophyletic allele and L to be small. Because the expected
value of χ²₁ is 1 on the null hypothesis, the bias induced by
taking ρ to be positive is about √(2/π)/√K̄ for diallelic
markers, where K̄ is the mean value of K per marker. In this
example the bias is .050. When M is 1 and ε is estimated,
virtually identical values of S_D are obtained for L = 0 and .050.
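Two of the numerical statements above can be checked with a few lines of Python. The sketch uses the identity that a χ² with 1 degree of freedom is a squared standard normal deviate, so its tail probability is erfc(√(x/2)), and the fact that the mean absolute value of a normal deviate with variance 1/K̄ is √(2/π)/√K̄. The function names are hypothetical, and the mean information of about 255 is not reported in the text; it is simply the value implied by a bias of .050.

```python
import math

def chi2_1df_pvalue(x):
    """Upper-tail probability of a 1-df chi-square value x."""
    return math.erfc(math.sqrt(x / 2.0))

def positivity_bias(K_mean):
    """Expected |rho_hat| under the null when rho_hat ~ N(0, 1/K_mean)."""
    return math.sqrt(2.0 / math.pi) / math.sqrt(K_mean)

p = chi2_1df_pvalue(5.38)             # location test reported in the text
implied_K = (2.0 / math.pi) / 0.050 ** 2
print(p, implied_K, positivity_bias(implied_K))
```

The tail probability for 5.38 lies between .01 and .05, consistent with the statement that it is not significant at the .01 or .001 levels.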

The lod Z_1 for allelic association, calculated as in the
Appendix, is similar in the three analyses and overwhelmingly
significant (Table 4). It dwarfs the evidence on location from
linkage, which was necessary but not sufficient for positional
cloning. The interval between MET and D7S8 was too small
for reliable mapping by linkage at the time when CF was
recognized through recessive disease, hence the interest in
developing allelic association to localize the gene. By allelic
association Terwilliger (6) placed ΔF508 at 0.77 Mb, with a
13.8 support interval for χ² corresponding to a lod of 3 from
0.69 to 0.87, overlapping the CFTR locus but not including
ΔF508. Devlin et al. (15) localized ΔF508 at 0.81 Mb. Using a
subset of the Kerem sample, Xiong and Guo (16) estimated
error by their method as 75 kb. Using the same subset of the
data by this method gives an identical error. The capability of
ALLASS to pool different studies allows greater precision. For
the combined Kerem and Morral samples we place ΔF508 at
0.834 Mb (Table 4), within 50 kb of its physical location.

Table 3. The ΔF508 allele of CFTR: Tests of hypotheses

Table 4. The ΔF508 allele of CFTR: Estimates of lods,
parameters, and information
Parameters ε, L, and S estimated, M = 1. Z_1, lod corresponding to
χ² for association; S_D, estimated location of ΔF508; K_D, information
about location.

Discussion

In the location database ldb the sex-average distance between
MET and D7S8 is 0.8 cM (17), compared with a physical
distance of 1.67 Mb. The ratio z is twice as great as the rule of
thumb that equates 1 Mb to 1 cM. The estimated duration of
ΔF508 is 100zε, or 209 generations, but this would be an
underestimate if the allele persisted for a long time in a small
population that later expanded. The highest frequency of
ΔF508 is found north of the Alps in the region settled by Celtic
and Germanic tribes, but substantial frequencies occur in
Turkey, Russia, and Israel, suggesting dispersal during the
Neolithic as proposed by Serre et al. (18). Our estimated
duration, although obtained by an entirely different method, is
in close agreement with their estimate of 100-200 generations.
Morral et al. (14) estimated a duration an order of magnitude
greater at 2,627 generations, assuming a gametic mutation rate
of 3.3 × 10⁻⁴ or less. If the ancestral haplotype was 17-31-13,
the frequency of substitutions is .513, .330, and .021. Neglecting
multiple substitutions and recombination, the number of
generations at the assumed mutation rate is 1,555, 1,000, and
63. These estimates are variable, the gametic mutation rate is
uncertain, and neglect of recombination and selection may not
be justified. The highly significant value of ε in the pooled data
is evidence that recombination is of greater magnitude than
mutation over the interval from MET to D7S8. Allelic association
gives much less information about the age of ΔF508
than about its location.
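The mutation-based durations in this paragraph are simple ratios of substitution frequency to mutation rate, and the map comparison is a ratio of physical to genetic distance. The sketch below reproduces that arithmetic from the rounded figures quoted in the text; the variable names are illustrative.

```python
mutation_rate = 3.3e-4                        # assumed gametic mutation rate
substitution_freqs = [0.513, 0.330, 0.021]    # per microsatellite, as quoted

# Neglecting multiple substitutions and recombination:
# generations = substitution frequency / mutation rate.
generations = [f / mutation_rate for f in substitution_freqs]

physical_mb, genetic_cm = 1.67, 0.8           # MET to D7S8
mb_per_cm = physical_mb / genetic_cm

print([round(g, 1) for g in generations], round(mb_per_cm, 2))
```

The first two ratios reproduce the quoted 1,555 and 1,000 generations; the third comes out near 63.6 from the rounded frequency .021, against the quoted 63. The physical-to-genetic ratio is close to 2.1 Mb per cM.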

Terwilliger (6) applied multiple pairwise analysis to conditional
likelihood when a single, positively associated allele is
specified a priori at each marker locus. He assumed that all
markers were positioned exactly on a genetic map that could
be equated to a physical map by the 1 cM = 1 Mb rule of
thumb. The problem of testing for association and the resulting
bias L were not addressed, and negative associations were
excluded. Multiple associated alleles were considered in Table
3, which does not model approach to equilibrium under
recombination. Because no test was provided for goodness of
fit, there was no allowance for errors in the model.

Devlin et al. (15) drew attention to the fact that multiple
pairwise mapping (19) uses composite likelihood for which
useful mathematical theory has been developed (20). They
assume two alleles at each marker locus, but do not consider
how a larger number could be dichotomized. They introduce
the approximation R − Q ≈ R and assume L = 0, M = 1 to
approximate ε with no test for errors in the model. We allow
explicitly for case-control sampling and make minimal evolutionary
assumptions. Perhaps as a consequence, there is no
evidence of heterogeneity in this example.

Notes to Table 3: Parameters fixed by hypothesis are given. Values
of estimated parameters (ε, S, and L) are not shown. χ², goodness of fit
to Malecot model with df (degrees of freedom).

Sham and Curtis (21) introduced Monte Carlo tests for
disease association with alleles at a single marker locus. They
recognized that alleles should be combined in a way that
preserves the evidence for association. Xiong and Guo (16)
developed ingenious composite likelihood methods that incorporate
parameters for mutational age, population growth,
and recurrent mutation, unfortunately not known with any
precision. When the physical location is given, ad hoc assumptions
can be introduced to improve the estimate from allelic
association. In the more relevant case of unspecified physical
location, there is little basis for choice of unknown parameters
that may make the estimate from allelic association better or
worse. Testing for associated alleles, the difference between
genetic and physical maps, and allowance for errors in the
model are not considered. They gave several examples in which
their method worked better than earlier methods. For the
CFTR locus their estimated error using 19 markers selected
from the reported 23 (12) was 75 kb. Using the same subset of
the data with our method we obtain exactly the same error.
With the full set of 27 markers the error is reduced to only 46
kb.

We have not yet attempted to map a disease locus in complex
inheritance, where marker gene frequencies in cases and
controls provide reduction to 2 × 2 tables but the locus cannot
be haplotyped. This must be a severe constraint on the power
of allelic association, as is the small interval in which allelic
association can sometimes be detected (2). Efficient combination
with linkage allows the same family material to be used
for both tests. Although isolated cases are easier to collect than
familial cases, they are more likely to be phenocopies and are
usually less informative for linkage.

The lod score required for reliable detection of a candidate
locus, which is as much as 9 when each marker locus is tested
individually (22), is minimized by partitioning the genome into
regions of 10 or more megabases (Mb), within which only a
single candidate is sought. Then there is only one degree of
freedom for disease location, regardless of the total number of
alleles in the region. If markers are sufficiently dense, a
combination of few tests and high power justifies the canonical
lod of 3, and evidence from linkage and allelic association may
be used to give a single, optimal location and test of significance.
It remains to be seen how this approach performs with
multiple disease mutations and complex inheritance.
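The information-weighted combination of location estimates referred to above (the weighted mean Σ K S / Σ K, with information divided by χ²/df when the residual χ² is significant) can be sketched as follows. The function names and the two example estimates are hypothetical; treating 1/√K as the standard error assumes that information is the inverse of the sampling variance.

```python
import math

def deflate(K, chi2, df):
    """Reduce information by chi2/df; the caller applies this only when the
    residual chi-square is significant."""
    return K / (chi2 / df)

def combine(estimates):
    """Pool (location, information) pairs into a weighted mean location.

    Returns (pooled location, total information, standard error).
    """
    total_K = sum(K for _, K in estimates)
    S = sum(S * K for S, K in estimates) / total_K
    return S, total_K, 1.0 / math.sqrt(total_K)

# Hypothetical linkage and association estimates of location (Mb) with their information.
linkage = (0.80, 100.0)
association = (0.90, 300.0)
S_pooled, K_pooled, se = combine([linkage, association])
print(S_pooled, K_pooled, se)
```

The pooled location is pulled toward whichever source carries more information, which is the sense in which the combination is efficient regardless of which is more informative.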

Appendix: Numerical Analysis 

In Table 1 let U_y = ∂ ln lk/∂y for y = R, ρ, with
corresponding information matrix [K_yy] that reflects sampling
from the current population but not drift over generations.
Newton-Raphson iteration gives ρ̂. Under H_0 the score for ρ
is U = (ad − bc)n/[(a + c)(c + d)] with conditional information
K = n(a + b)(b + d)/[(a + c)(c + d)], where n = a + b + c
+ d and ρ̂ = U/K, and U²/K is the usual χ²₁ for a 2 × 2
contingency table. An apparently significant χ² is reduced by
Yates' correction, deducting n/2 from |ad − bc|. Trial values
are Q_0 = (a + b)/[(ω − 1)(c + d) + n], ρ_0 = (ad − bc)/[(a
+ b)d], and R_0 = c/(c + d). At ρ = 1 only R is estimated.
Because of the instability of K_ρρ, the information K_i about ρ for
the ith marker is taken as the lesser of K and χ²/ρ̂².
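The null-hypothesis score statistics and trial values in this paragraph translate directly into code. The sketch below is illustrative only: the function names and the example counts are hypothetical, and the full maximum likelihood iteration over R and ρ is not shown.

```python
import math

def h0_score(a, b, c, d, yates=False):
    """Score U, conditional information K, rho = U/K, and chi-square = U^2/K under H0."""
    n = a + b + c + d
    det = a * d - b * c
    if yates:
        # Yates' correction: deduct n/2 from |ad - bc|, keeping the sign.
        det = math.copysign(max(abs(det) - n / 2.0, 0.0), det)
    U = det * n / ((a + c) * (c + d))
    K = n * (a + b) * (b + d) / ((a + c) * (c + d))
    return U, K, U / K, U * U / K

def trial_values(a, b, c, d, omega):
    """Starting values Q_0, rho_0, R_0 for the iteration."""
    n = a + b + c + d
    Q0 = (a + b) / ((omega - 1.0) * (c + d) + n)
    rho0 = (a * d - b * c) / ((a + b) * d)
    R0 = c / (c + d)
    return Q0, rho0, R0

# Hypothetical 2 x 2 haplotype counts.
U, K, rho, chi2 = h0_score(30, 20, 10, 40)
print(U, K, rho, chi2)
print(h0_score(30, 20, 10, 40, yates=True)[3])
print(trial_values(30, 20, 10, 40, omega=36.4))
```

The uncorrected U²/K agrees with the familiar contingency-table statistic n(ad − bc)²/[(a + b)(c + d)(a + c)(b + d)], which is a useful check on the two formulas.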

For Eq. 5 the information matrix is calculated by exact 
second derivatives after convergence under a variable metric 
algorithm (23). 

To compute the lod Z_1 with 1 degree of freedom that has the
same significance level as χ² with m degrees of freedom, a
numerical recipe to obtain the corresponding probability p (23)
was modified to return the natural logarithm (ln p), and the
Hastings approximation to the corresponding normal deviate
x_p was used with

$$t = \sqrt{-2 \ln (p/2)}$$  (ref. 24, equation 26.2.23).

Then Z_1 = x_p²/(2 ln 10).
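The conversion from a tail probability to an equivalent 1-df lod can be sketched as below. The rational approximation and its constants are those of equation 26.2.23 in ref. 24; the function names are hypothetical, and ln p is taken as an input on the assumption that the χ² tail probability has already been computed elsewhere (the text obtains it from a modified numerical recipe).

```python
import math

# Constants of the Hastings rational approximation (ref. 24, equation 26.2.23).
_C = (2.515517, 0.802853, 0.010328)
_D = (1.432788, 0.189269, 0.001308)

def normal_deviate_two_tailed(ln_p):
    """Normal deviate x_p whose two-tailed probability is p, given ln p."""
    t = math.sqrt(-2.0 * (ln_p - math.log(2.0)))      # t = sqrt(-2 ln(p/2))
    num = _C[0] + _C[1] * t + _C[2] * t * t
    den = 1.0 + _D[0] * t + _D[1] * t * t + _D[2] * t ** 3
    return t - num / den

def lod_1df(ln_p):
    """Z_1 = x_p^2 / (2 ln 10): the 1-df lod with the same significance level."""
    x = normal_deviate_two_tailed(ln_p)
    return x * x / (2.0 * math.log(10.0))

print(normal_deviate_two_tailed(math.log(0.05)), lod_1df(math.log(0.05)))
```

Working with ln p rather than p avoids underflow when the association is overwhelmingly significant. The same factor 2 ln 10 links a lod of 3 to a χ² of about 13.8.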

1. Malécot, G. (1948) Les Mathématiques de l'Hérédité (Masson et
Cie, Paris).

2. Morton, N. E. & Wu, D. (1988) Am. J. Hum. Genet. 42, 173-177.

3. Agresti, A. (1990) Categorical Data Analysis (Wiley, New York).

4. Morton, N. E. (1997) Rivista di Antropologia 74, 1-9.

5. Lio, P. & Morton, N. E. (1997) Proc. Natl. Acad. Sci. USA 94,
5344-5348.

6. Terwilliger, J. D. (1995) Am. J. Hum. Genet. 56, 777-787.

7. Morton, N. E., Lew, R., Hussels, I. E. & Little, G. F. (1972) Am. J.
Hum. Genet. 24, 277-289.

8. Spielman, R. S., McGinnis, R. E. & Ewens, W. J. (1993) Am. J.
Hum. Genet. 52, 506-516.

9. Morton, N. E., Klein, D., Hussels, I. E., Dodinval, P., Todorov,
A., Lew, R. & Yee, S. (1973) Am. J. Hum. Genet. 25, 347-361.

10. Kaplan, N. L., Hill, W. G. & Weir, B. S. (1995) Am. J. Hum. Genet.
56, 18-32.

11. Anand, R., Ogilvie, D. J., Butler, R., Riley, J. H., Finniear, R. S.,
Powell, S. J., Smith, J. C. & Markham, A. F. (1991) Genomics 9,
124-130.

12. Kerem, B., Rommens, J. M., Buchanan, J. A., Markiewicz, D.,
Cox, T. K., Chakravarti, A., Buchwald, M. & Tsui, L.-C. (1989)
Science 245, 1073-1080.

13. Tsui, L.-C. (1992) Hum. Mut. 1, 197-203.

14. Morral, N., Bertranpetit, J., Estivill, X., Nunes, V., Casals, T., et
al. (1994) Nat. Genet. 7, 169-175.

15. Devlin, B., Risch, N. & Roeder, K. (1996) Genomics 36, 1-16.

16. Xiong, M. & Guo, S.-W. (1997) Am. J. Hum. Genet. 60, 1513-
1531.

17. Collins, A., Frezal, J., Teague, J. & Morton, N. E. (1996) Proc.
Natl. Acad. Sci. USA 93, 14771-14775.

18. Serre, J. L., Simon-Bouy, B., Mornet, E., Jaume-Roig, B., Balassopoulou,
A., Schwartz, M. & Taillandier, A. (1990) Hum. Genet.
84, 449-454.

19. Morton, N. E. (1978) Human Gene Mapping 4 (1977): Fourth
International Workshop on Human Gene Mapping (S. Karger,
Basel), pp. 15-36.

20. Lindsay, B. G. (1988) Contemporary Mathematics 80, 221-239.

21. Sham, P. C. & Curtis, D. (1995) Ann. Hum. Genet. 59, 97-105.

22. Risch, N. & Merikangas, K. (1996) Science 273, 1516-1517.

23. Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P.
(1992) Numerical Recipes in C (Cambridge Univ. Press, Cambridge,
U.K.), 2nd Ed.

24. Abramowitz, M. & Stegun, I. A. (1965) Handbook of Mathematical
Functions (Dover, New York).






